


[Fandom stats tutorial] How to gather fandom data & do fandom stats

by toastystats (destinationtoast)



Series: Fandom Stats [101]
Category: Fandom - Fandom
Genre: Fanwork Research & Reference Guides, Meta, Nonfiction
Language: English
Status: In-Progress
Published: 2020-12-02
Updated: 2020-12-01
Packaged: 2021-03-09 23:28:48
Rating: General Audiences
Warnings: No Archive Warnings Apply
Chapters: 1
Words: 1,026
Publisher: archiveofourown.org
Story URL: https://archiveofourown.org/works/27804601
Author URL: https://archiveofourown.org/users/destinationtoast/pseuds/toastystats
Summary: How to find the biggest fandoms and do other fandom stats-y type things.  No math or stats background required!
Series URL: https://archiveofourown.org/series/60910
Comments: 16
Kudos: 38






**Author's Note:**

> People ask me sometimes how to do the stuff I do and/or how to get started with their own fandom stats projects. Here are a few tips!

**Method 1: Copy & paste. Requires a computer, a web browser, and a spreadsheet program (e.g., Google Sheets or Excel).**

AO3 actually lists every fandom on the archive on a few different webpages:

The “All Fandoms” page unfortunately doesn’t really list all the fandoms; it just shows the top few fandoms in each category and then points to the other pages. But each of the other pages lists literally every fandom within a given category (in alphabetical order). For instance, Movies shows the name of every movie that has any fanworks on AO3:

The numbers in parentheses show how many fanworks each movie has. So we have all the data we need to find the biggest fandoms. All we have to do is copy and paste this list of Movies (along with the other fandom lists, like Anime & Manga) into a spreadsheet and then reformat it a bit. Then we’ll be able to sort the fandoms by their sizes and find the biggest ones.

We’ll keep using the Movies page as an example. Here, you want to select all the text on the webpage at once. Your browser should let you do this using some sort of command like Ctrl+A or Command+A, or using the menu: 

[ ](https://photos.app.goo.gl/co2BMt47Xr4NzNjx6)

[ ](https://photos.app.goo.gl/Q2q4wBuwedAwsHez8)

Now just copy this and paste it into a spreadsheet (I’m using Google Sheets, which is free but requires a Google Account). Your spreadsheet will initially look like a mess because the top part contains the top of the AO3 page:

[ ](https://photos.app.goo.gl/oatmd6wd8M4Bn9Bz6)

But if you scroll down, you will see the names of the movie fandoms:

[ ](https://photos.app.goo.gl/AewWo53iURZkr6f79)

You can delete the first rows before the list of fandoms starts.  


Note that if you want the top fandoms from ALL of AO3, not just the top movie fandoms, you should also copy/paste all the other fandom pages from AO3 into this spreadsheet (Anime & Manga, etc.). I'm not going to show that here, but the steps are the same.

Next we want to split the text fields up so that the fandom name and the number of fanworks are in different columns. Fortunately, Google Sheets lets you split text into separate columns. Select Column A and then go to the Data menu: 

[](https://photos.app.goo.gl/RqpEbSKzRWjbGrL47)

Because the number of fanworks is in parentheses, we can tell Google Sheets to start a new column every time it sees the open parenthesis: "(" We do this by specifying a Custom Separator (this pops up after you choose "Split text to columns"):

[ ](https://photos.app.goo.gl/eDf2QpAfVxpdFBya6)

[ ](https://photos.app.goo.gl/pQ25cDgMQ8tZoEpJ7)

[](https://photos.app.goo.gl/EZoreRfvvEoHXWNn6)

The split is a bit more complicated than it sounds, because some of the fandom names also contain parentheses that help disambiguate which movie is meant -- e.g., "The Avengers (Marvel Movies)" or "101 Dalmatians (1961)." In those cases, the text gets split up into 3+ columns instead of 2. But that's fine - we can still line everything up.
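To see why some rows end up with extra columns, here is a minimal Python sketch of the same kind of split, done outside the spreadsheet. The function name `split_on_open_paren`, the "Some Movie" fandom, and the work counts in the sample lines are all made up for illustration; only "The Avengers (Marvel Movies)" comes from the example above.

```python
def split_on_open_paren(line):
    """Split one pasted line at every "(", roughly like the custom separator."""
    # Trim the spaces left around each piece after splitting.
    return [piece.strip() for piece in line.split("(")]


# Sample lines; the work counts here are invented for illustration.
simple = split_on_open_paren("Some Movie (42)")
tricky = split_on_open_paren("The Avengers (Marvel Movies) (12345)")

print(simple)  # 2 pieces: name, count
print(tricky)  # 3 pieces: name, disambiguation, count
```

Every piece after the first still carries its closing ")", which is why a later clean-up step is needed.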

First, sort Column B from Z->A. 

[ ](https://photos.app.goo.gl/WH7Ld4sFkaaNe8wG6)

Then sort Column C the same way. Repeat with Column D, etc., until you reach an empty column.

[](https://photos.app.goo.gl/zJpzW8XkTNBMNEea6)

After these sorts, the fandom names that had the most parentheses in their names are at the top, and those that had no parentheses in the name are at the bottom. Now cut and paste the final number in each row so that they are all in the last column -- Column D in this example. You'll have to do this at least twice: once for the fanworks numbers that were in Column B, once for those in Column C, and so on.

[ ](https://photos.app.goo.gl/TtceH1gbAuVRJxyy7)

Once all the fanworks numbers are in the same column, we want to remove the final parentheses. We do that by selecting this column and then using "Find and replace" to replace the ")" character with nothing:

[ ](https://photos.app.goo.gl/Lk1CXVxdHPwbYogE7)

Type ")" into the "Find" field, and don't type anything into the "Replace" field. Then hit the "Replace all" button. This will remove all the end parentheses.

[ ](https://photos.app.goo.gl/kWZVgjtsC5iBQUbb8)
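If you'd rather do the lining-up and the ")" removal in code, the sketch below shows one possible way in Python. It assumes each row is a list of cells that came from splitting on "(", and that the last non-empty cell is always the fanworks number. The function name `tidy_row` and the counts in the examples are made up for illustration.

```python
def tidy_row(cells):
    """Turn split cells like ["101 Dalmatians", "1961)", "250)"] into (name, count)."""
    # Ignore the empty cells that shorter rows have on the right.
    filled = [cell for cell in cells if cell != ""]
    # Assumption: the last filled cell is the fanworks number.
    *name_parts, count_text = filled
    # Same effect as replacing ")" with nothing in the count column.
    # The comma removal is only a precaution in case a pasted number
    # contains thousands separators (an assumption, not a known format).
    count = int(count_text.replace(")", "").replace(",", ""))
    # Put the "(" back in front of each disambiguation piece of the name.
    name = " (".join(name_parts)
    return name, count


print(tidy_row(["101 Dalmatians", "1961)", "250)"]))
print(tidy_row(["Some Movie", "42)", ""]))
```

A row that has no number at all would raise an error here, so any leftover header rows need to be deleted first, just as in the spreadsheet.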

Now sort the final column Z->A once more. This gives us a list of all the fandoms, sorted by size and starting with the biggest ones: 

[ ](https://photos.app.goo.gl/2Y1bsVX47Z8AHq546)
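The Z->A sort has a simple equivalent in Python. This is only a sketch: it assumes the data is already a list of (name, count) pairs, and every fandom name and number below is a placeholder, not real AO3 data.

```python
def biggest_first(fandoms):
    """Sort (name, count) pairs from most fanworks to fewest."""
    return sorted(fandoms, key=lambda pair: pair[1], reverse=True)


# Placeholder lists standing in for two pasted category pages.
movies = [("Movie A", 120), ("Movie B", 4500), ("Movie C", 37)]
anime = [("Anime D", 980)]

# Joining the lists before sorting ranks the categories together.
combined = biggest_first(movies + anime)
for name, count in combined:
    print(name, count)
```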

(Remember, in this example it's just the Movies fandoms, but I could have copy/pasted all the fandom lists into one spreadsheet if I wanted a complete list of all AO3 fandoms.)

You can add the initial parentheses back to the intermediate columns, if you want. First, select these columns and choose Format->Number->Plain Text. 

[](https://photos.app.goo.gl/CWUUBZ2H2dFvGwyy9)

Then use Find and Replace on these same columns. Select the "Search using regular expressions" box. In the "Find" field, type "^(.)" and in the "Replace" field, type "($1" and then hit the "Replace all" button. (This looks for all the cells that are not empty and sticks a parenthesis at the beginning.) 

[](https://photos.app.goo.gl/5YKjf7FKpcZegnKP7)
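For anyone curious about what that regular expression does, here is the same find-and-replace as a small Python sketch. The function name `add_open_paren` is made up for illustration. Note that Python's `re` module writes the captured group as `\1` where the Find and Replace dialog uses `$1`.

```python
import re


def add_open_paren(cell):
    """Put "(" in front of a cell's text, leaving empty cells alone."""
    # "^(.)" captures the first character of the cell, so an empty
    # cell has nothing to match and comes back unchanged.
    return re.sub(r"^(.)", r"(\1", cell)


print(add_open_paren("1961)"))           # cell with text gets a "("
print(repr(add_open_paren("")))          # empty cell stays empty
```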

Now if you want to make a graph of, say, the biggest ten fandoms, you can select the rows that you want to graph and choose Insert->Chart: 

[](https://photos.app.goo.gl/FKQpFz2cx7MG3Q5w7)

I recommend choosing the chart type "Bar chart" for this case, because it makes it easier to read the fandom names when they're lined up horizontally. 

[](https://photos.app.goo.gl/QeTfxbGhbvvWzPcv6)

Voila! A graph of the biggest AO3 fandoms! (Or just the biggest Movie fandoms, in this case -- but hopefully it's clear from the instructions above how to include all the fandoms if you want.) 
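If you have the data in code instead of a spreadsheet, a rough text version of the same horizontal bar chart can be sketched in plain Python. This is not how the charts in this tutorial were made; the function name `text_bar_chart`, its parameters, and the sample numbers are all invented for illustration.

```python
def text_bar_chart(fandoms, top_n=10, width=40):
    """Return lines of a horizontal bar chart for the top_n biggest fandoms."""
    top = sorted(fandoms, key=lambda pair: pair[1], reverse=True)[:top_n]
    if not top:
        return []
    biggest = top[0][1]
    # Pad names to the same length so the bars line up.
    label_width = max(len(name) for name, _ in top)
    lines = []
    for name, count in top:
        # Scale each bar relative to the biggest fandom.
        bar_length = round(width * count / biggest) if biggest > 0 else 0
        lines.append(f"{name.ljust(label_width)} | {'#' * bar_length} {count}")
    return lines


# Placeholder data, not real AO3 numbers.
for line in text_bar_chart([("Movie A", 120), ("Movie B", 4500), ("Movie C", 980)]):
    print(line)
```

As with the spreadsheet chart, the names run down the side so that long fandom names stay readable.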

**Method 2: Use my fandom stats script. Requires a computer, and requires you to download and run some code at a command line (e.g., in Terminal on a Mac).**

I've shared some [fandom stats code](https://github.com/fandomstats/toastystats) to do stuff like the above more automatically (some of my fandom stats code is a bit out of date or buggy at this point, but this part should still work okay). The first section of my [AO3 README](https://github.com/fandomstats/toastystats/blob/master/AO3/AO3_README) explains how to download and run the script topFandoms.py to find the biggest fandoms on AO3. If you’re unused to doing anything at the command line, this may be a bit confusing or intimidating; I’m sorry I don’t have a good tutorial off-hand (others should feel free to suggest some in the comments).  


You can import the output of this script into a spreadsheet and then you'll have the lists of fandoms + numbers of fanworks without having to go through all the copy/paste/reformat steps in Method 1. 

**Notes for the Chapter:**

> I hope this made sense and was helpful! But let me know if it was confusing or didn't work, and I can try to correct/clarify the tutorial. I'm also planning to add more chapters; please feel free to request other topics.


